Plagiarism Detection Systems – an Evaluation of Several Systems

Author

  • Ahmad Hashem
Abstract

In the Open University environment, where students are not centrally located and are not under any direct supervision, the potential for plagiarism definitely exists. The very technology that facilitates open learning also allows easy exchange of papers among peers. Students can also easily access “paper-mills” where essays can be quickly customised to suit requirements, for a fee. There is also the vast body of information residing on the Internet, ready for creative reuse. In the light of all these temptations, the Open University of Malaysia (OUM) is exploring the use of technology to educate students and deter plagiarism. One approach that appears promising is to use a good commercial plagiarism detection system. The system, by being able to detect cases of plagiarism, would serve as a deterrent and hopefully contribute towards inculcating a culture of honesty. This paper presents the findings from a small study using two detection systems: a commercial Plagiarism Detection System, MyDropBox, and a simple and free automatic file comparison system, Pl@giarism. The results show that while the automatic system is very useful, the simple and free system can be adequate for most purposes. Recommendations on a suitable system for the OUM context are then made based on these findings.

PLAGIARISM DETECTION SYSTEMS

The Internet has become a vast repository of easily accessible knowledge. Most of it is freely available via any Internet search engine, while some can be accessed by members only (free or subscription). Students are availing themselves of this vast pool of knowledge. This ease of access has spawned an undesirable culture, that of ‘cut-and-paste’. Passages from the Internet are copied and pasted to be presented as the student’s own work without substantial rewriting or due citation. The diligent teacher can look at clues such as stylistic changes to spot “cut-and-paste” portions of essays submitted by students.
But not all teachers are diligent, or have the time to be diligent; thus, to ensure uniformity in the treatment of all submitted essays, some form of automatic plagiarism detection must be employed. Deterring plagiarism and training students to abhor plagiarism would certainly contribute to improving quality in higher education.

Plagiarism Detection Systems can help find the sources of online passages that occur in an essay. The assumption behind all these systems is similar: material for essays can be downloaded from freely available Internet resources as well as proprietary online databases and presented in a student’s essay. By comparing phrases in an essay with those from online resources, occurrences of online material in a student’s essay can be discovered. It is important to remember, when using any Plagiarism Detection System, that whether the discovery of passages in an essay similar to an online resource necessarily implies plagiarism can never be decided absolutely by the system itself. Most systems highlight phrases in an essay that are identical or nearly identical to online resources, and assign some sort of similarity index depending on the extent of the similarity. Establishing plagiarism is never easy. There are issues such as intent, incorrect citation, unfamiliarity with the language and perception of what constitutes plagiarism that need to be considered (Ashworth et al., 1997). Thus, after discovery it is best left to a human reader to decide on the next course of action.

THE 6 SEAAIR ANNUAL CONFERENCE; Langkawi, September 5 – 7 2006

Of the many detection systems on the market that claim to help reduce the incidence of plagiarism, perhaps the most “famous” is Turnitin (www.turnitin.com). Many universities throughout the world have formed consortia to make better use of this software for detecting plagiarism.
One such example is the Plagiarism Detection Centre, based at the University of Northumbria, which was set up by the Joint Information Systems Committee (JISC) of all the Universities in the United Kingdom (http://www.jiscpas.ac.uk/). The Research Policy and Practice Committee now requires that all students’ essays be submitted to the JISC detection service. It should be noted that many of these systems are not free but are subscription-based according to several financial models. Also, issues such as intellectual property rights of students’ essays must be considered with some of these systems.

A COMPARISON OF TWO PLAGIARISM DETECTION SYSTEMS

At the OUM it has been felt for some time now that plagiarism cases have been on the increase. There has not been any quantitative study to support this, but anecdotal evidence abounds. In order to get a better understanding of the extent of plagiarism, and perhaps also gain some idea of deterrent approaches, a small study was carried out. A small sample of essays from several OUM Learning Centres was collected for analysis. In total, 100 essays from four different subjects and from three Learning Centres were analysed. These essays were hard-copy documents that had been handed in by the students to their tutors for marking. On average an essay is about 10 A4 pages long. The essays were analysed by a typical automatic Plagiarism Detection System, and in the process aspects such as ease of use, adaptability to the OUM workflow and the extent of plagiarism in the OUM were assessed. An automatic system that can detect similarities between pairs of essays in a database was also tested.

There have been many studies comparing the many plagiarism detection systems. The Virtual Academic Integrity Laboratory (VAIL, 2004) report is quite comprehensive. Many reports and discussions on plagiarism detection systems can be found on the web.
For this study, MyDropBox (www.mydropbox.com), a subscription-based plagiarism detection system that has received some good reviews, was compared with the free and semi-automatic Pl@giarism (www.plagiarism.tk), developed by the University of Maastricht, the Netherlands. It is used extensively by the Law faculty there. The free system is being evaluated here because a free system is usually the most cost-effective option.

MyDropBox provides 150 days of free use for testing purposes. Submitted essays are first compared automatically against a group of selected essays or against every essay in its proprietary databases, as well as resources on the Internet. MyDropBox claims to have access to nearly 400,000 previously submitted and stored essays as well as access to certain subscription-based digital libraries. The MyDropBox submission process is as follows:

1. Tutor submits essays to MyDropBox.
2. MyDropBox checks essays against its internal collection of essays, sources on the Internet, open and subscription-based online libraries and other proprietary sources.
3. MyDropBox generates a report highlighting sections of the essays with similar passages found from other sources.
4. Tutor scrutinizes the report and decides whether plagiarism has taken place.

Pl@giarism is a simple program that automates the process of determining similarities between pairs of essays by comparing three-word phrases in each. Each essay is paired in turn with every other essay in the folder. Pl@giarism does not automatically check against sources on the Internet, but there is a provision for selecting and submitting one phrase at a time to an Internet search engine for comparison with Internet resources. Pl@giarism is a useful standalone detector for peer-to-peer copying.
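
The pairwise comparison of three-word phrases described above can be illustrated with a short sketch. This is not Pl@giarism's actual implementation, whose internals are not given here. The tokenisation rule, the similarity measure (shared phrases divided by the phrase count of the shorter essay) and the file names are all assumptions chosen for illustration.

```python
import re
from itertools import combinations


def three_word_phrases(text):
    """Return the set of consecutive three-word phrases in text.

    Assumption: words are lowercased runs of letters, digits and apostrophes.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def similarity(text_a, text_b):
    """Share of the shorter essay's phrases that also occur in the other.

    Assumption: this ratio stands in for a tool's "similarity index".
    """
    a, b = three_word_phrases(text_a), three_word_phrases(text_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def compare_all(essays):
    """Pair each essay with every other essay; highest similarity first."""
    scores = [
        (similarity(essays[x], essays[y]), x, y)
        for x, y in combinations(sorted(essays), 2)
    ]
    return sorted(scores, reverse=True)


# Hypothetical essays standing in for the files in a folder.
essays = {
    "essay_a.txt": "The Internet has become a vast repository of easily accessible knowledge.",
    "essay_b.txt": "Today the Internet has become a vast repository of knowledge for students.",
    "essay_c.txt": "Open learning relies on tutors who mark essays at several centres.",
}

for score, first, second in compare_all(essays):
    print(f"{first} vs {second}: {score:.2f}")
```

A high score only flags a pair for a human reader to inspect. As noted earlier, deciding whether plagiarism has taken place remains the tutor's judgement.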

Similar resources

An Evaluation of Web Plagiarism Detection Systems for Student Essays

This study uses purpose-built test data and empirical experiments to report on the performance of four web plagiarism detection systems: TurnitIn, SafeAssignment, Plagiarism-Finder and EVE. In addition to measuring accuracy of detection, we evaluated the extent to which these systems produce false detections. We obtained the test data from multiple sources and edited it in several ways to conce...

Plagiarism Detection - State-of-the-art systems (2016) and evaluation methods

Plagiarism detection systems comprise various approaches that aim to create a fair environment for academic publications and appropriately acknowledge the authors’ works. While the need for a reliable and performant plagiarism detection system increases with an increasing amount of publications, current systems still have shortcomings. Particularly intelligent research plagiarism detection stil...

UPPC - Urdu Paraphrase Plagiarism Corpus

Paraphrase plagiarism is a significant and widespread problem and research shows that it is hard to detect. Several methods and automatic systems have been proposed to deal with it. However, evaluation and comparison of such solutions is not possible because of the unavailability of benchmark corpora with manual examples of paraphrase plagiarism. To deal with this issue, we present the novel de...

Automatic Student Plagiarism Detection: Future Perspectives

The availability and use of computers in teaching has seen an increase in the rate of plagiarism among students because of the wide availability of electronic texts online. While computer tools that have appeared in the recent years are capable of detecting simple forms of plagiarism, such as copy-paste, a number of recent research studies devoted to evaluation and comparison of plagiarism dete...

An n-gram based Method for nearly Copy Detection in Plagiarism Systems

There has been plagiarism, as a concept of "intellectual property theft", from the time that human and artistic research activities have been created. But easy access to the web, the massive database of information and communications system in recent years has led to the issue of plagiarism as a serious issue for publishers, researchers and the research institutions. In this paper, we introduce a ...

Automatic Student Plagiarism Detection: Future Perspectives

The availability and use of computers in teaching has seen an increase in the rate of plagiarism among students because of the wide availability of electronic texts online. While computer tools that have appeared in recent years are capable of detecting simple forms of plagiarism, such as copy-paste, a number of recent research studies devoted to evaluation and comparison of plagiarism detectio...



Publication date: 2006